Abstract

Multivariate statistical methods have flourished over the last two decades because computer
software has made their computation easy, and efforts have also been made to enhance the
interpretation of their results. Applying commonality analysis to canonical correlation analysis,
the most general multivariate linear model, is part of that trend. In a paper presented at the
annual meeting of the Southwest Educational Research Association, Thompson and Miller
generalized commonality analysis for use in canonical correlation analysis (Thompson &
Miller, 1985). Several papers have since discussed different aspects of that extension,
canonical commonality analysis. The present paper summarizes those findings and, using a
data set, demonstrates how canonical commonality analysis sheds light on the interpretation of the
model effect and provides guidelines for deleting predictors in canonical correlation analysis.
Introduction

Multivariate statistical methods have been available to researchers for many decades,
but they were not used as frequently as they should have been until the last twenty years. With
reference to canonical correlation analysis, Thompson and Miller (1985, p. 1) credit the revival of
multivariate statistics to “its computerization and inclusion in major statistical
package.” Beyond ease of access, researchers in social science also find multivariate
methods appropriate, and even mandatory in some circumstances, in view of the intertwined
nature of the factors involved in the phenomena under investigation. Campbell (1992, pp. 1-2)
discusses at length the importance of using multivariate statistics, quoting authors such as
Fish, Huberty and Morris, LeCluyse, and Thompson. She lists three reasons for using
multivariate statistics: (a) they control the experiment-wise Type I error rate; (b) they detect
statistically significant results that exist but that univariate statistics may miss; and (c) they best
honor the complexities of reality. Campbell points to the third reason as the most important. The
complexities of reality that call for multivariate models, in which causes have multiple effects
and effects have multiple causes, also make the output difficult to interpret. Researchers
are interested not only in the overall effect of the model; they also need to know where, among
the multiple causes, the effect comes from in order to interpret the output adequately.
Commonality analysis has proven useful in interpreting multiple regression output, as it
helps researchers identify the unique and shared contributions of the predictors in the
regression model. Efforts have been made to extend it to multivariate analysis to enhance the
interpretation of the results.

Knapp (1978) demonstrated that canonical correlation analysis is the most general
parametric test, subsuming not only the multivariate analyses but all other parametric tests as
special cases. Thompson (1988) illustrated, with a hypothetical data set, that canonical
correlation analysis gives the same results as multiple regression. If canonical correlation
analysis subsumes multiple regression as a special case, commonality analysis should be as useful
for interpreting its results as it is for multiple regression. In a paper presented
at the annual meeting of the Southwest Educational Research Association, Thompson and Miller
(1985) generalized commonality analysis for use in canonical correlation analysis.
Several papers have since discussed different aspects of that extension, canonical
commonality analysis. The present paper summarizes those findings and, using a data set,
demonstrates how canonical commonality analysis sheds light on the interpretation of the model
effect and provides guidelines for deleting predictors in canonical correlation analysis. A brief
review of commonality analysis is a logical place to start. A mechanism for defining and
computing the components of variance of a dependent variable regressed on k predictors
is discussed with an example. Attention is then given to some important issues in conducting
commonality analysis. The latter part of the paper is devoted to implementing commonality
analysis in the canonical case.

Commonality Analysis for Multiple Regression 
Commonality analysis is a method for partitioning $R^2$ in multiple regression, the
explained variance in the dependent variable, into constituents associated with the unique effects
of each predictor variable and the common effects of any combination of the predictors. The
method was originally suggested by Kempthorne in 1957. It has gone by different names:
“element analysis” by Newton and Spurrell, and “components analysis” by Mayeske et al. The term
“commonality analysis” was first used by Mood in 1971 (Daniel, 1989). Commonality analysis
helps the researcher identify the relative importance of all predictor variables in the regression
model by showing the percentage of dependent variable variance contributed by each of them
uniquely and by two or more of them in common. The predictor variables in the model may be
independent variables representing main effects or product variables representing
interaction effects. For the sake of clarity, regression models without product variables are
discussed first, and interaction effects are left for later discussion.

Defining and Computing Components of the D.V.'s Explained Variance 
Each component in the partition of the explained variance of the dependent variable
under commonality analysis is associated with one or more of the k predictors in the regression
model. Since the number of ways of choosing at least one object from k objects is $2^k - 1$, the
commonality analysis of a k-predictor regression model has $2^k - 1$ components. Among those
components, k depict unique effects and the remaining $2^k - k - 1$ depict common
effects. For example, in a two-predictor regression model, the number of components identified in
commonality analysis is $2^2 - 1 = 3$. Let y be the dependent variable and a and b be the two
predictors. The three components are then $U_a$ and $U_b$ for the unique effects and $C_{ab}$ for the
common effect. Since $U_a$ is the variance associated only with a and not shared with b, $U_a$ can be
calculated by subtracting from the explained variance $R^2$ the squared correlation between y and b
alone, $R^2_{y.b}$, that is, the $R^2$ of the model from which a is omitted. $U_b$ can be calculated in a
similar way. $C_{ab}$ is then found by subtracting $U_a$ and $U_b$ from $R^2$.

$$U_a = R^2 - R^2_{y.b}$$
$$U_b = R^2 - R^2_{y.a}$$
$$C_{ab} = R^2 - U_a - U_b$$
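
Substituting the first two formulas into the third expresses the common component directly in terms of regression $R^2$ values. The short derivation below is a sketch in which $R^2$ denotes the full two-predictor model and $R^2_{y.a}$ and $R^2_{y.b}$ denote the two one-predictor models:

```latex
\begin{aligned}
C_{ab} &= R^2 - \left(R^2 - R^2_{y.b}\right) - \left(R^2 - R^2_{y.a}\right) \\
       &= R^2_{y.a} + R^2_{y.b} - R^2
\end{aligned}
```

Written this way, the common component is the amount by which the two one-predictor $R^2$ values, added together, overshoot the $R^2$ of the joint model.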

For models with three or more predictors, however, the procedure is not as straightforward.
Wisler and Mood (1969) developed a polynomial approach to writing commonality
formulas for any number of predictor variables. Seibold and McPhee (1979) gave a step-by-step
presentation of this approach and a summary of the computational formulas for calculating
common effects in the two-, three-, four-, and five-predictor models.

This approach uses a product of k factors to represent each commonality component of a
k-predictor regression model. Two kinds of factors appear in the product, $X_j$ and $(1 - X_j)$:
$X_j$ represents a variable that is not involved in the unique or common effect under
consideration, and $(1 - X_j)$ represents a variable that is. In a four-predictor model, for example,
the product for $C_{134}$, the component common to variables 1, 3, and 4, is
$-(1 - X_1)X_2(1 - X_3)(1 - X_4)$, and the product for $U_1$, the unique effect of $X_1$, is
$-(1 - X_1)X_2X_3X_4$. In general, for the commonality component $C_{pqr}$ associated with three
predictors p, q, and r of a k-predictor regression model, the product is

$$C_{pqr}: \; -X_1 \cdots (1 - X_p) \cdots (1 - X_q) \cdots (1 - X_r) \cdots X_k$$

The commonality component involving all predictors is represented by
$-(1 - X_1)(1 - X_2) \cdots (1 - X_{k-1})(1 - X_k)$. When each product is expanded, it becomes a
polynomial with two to $2^k$ terms. Each term in these polynomials is a monomial in some of the
$X_r$, where $1 \le r \le k$. The polynomials are then transformed into formulae for calculating the
variance components: each term is replaced by the squared multiple correlation between the
dependent variable and the predictor variables included in that term, with the remaining
predictors left out of the regression. A constant term, which arises only in the component common
to all predictors, corresponds to a regression with no predictors and contributes zero. For
example, the variance component associated with the unique effect of $X_1$, namely $U_1$, is
calculated by the formula

$$U_1 = R^2 - R^2_{y.234}$$

derived from the polynomial

$$-(1 - X_1)X_2X_3X_4 = X_1X_2X_3X_4 - X_2X_3X_4$$

where $X_1X_2X_3X_4$ points to the total explained variance, the model $R^2$ involving all four
predictor variables, and $X_2X_3X_4$ points to $R^2_{y.234}$, the squared multiple correlation between
y and predictors 2, 3, and 4 with predictor 1 omitted. These squared multiple correlations can be
obtained by running multiple regression models that include only the predictors involved.
Rowell (1995) pointed out that SAS provides a useful procedure (PROC
RSQUARE) that prints the $R^2$ values for all possible combinations of the independent
variables in the model. This procedure greatly simplifies the calculation of the variance
components. Once the $R^2$ values are obtained, the computation of the components
is straightforward and involves only addition and subtraction. Formulae for computing the
commonality components of a three-predictor regression are given below as an example.
Components of Explained Variance in a Three-Predictor Regression Model 

In the Venn diagram (Figure 1), the rectangle represents the total variance in the dependent
variable. The circle represents the part that is explained by the predictors. The seven
($2^3 - 1 = 7$) commonality components associated with the three predictors partition the
explained variance.




Figure 1. Unique and Common Components of Explained 
Variance in Regressions for Three Predictors 



Following the polynomial approach, the components are:

$$U_1: \; -(1 - X_1)X_2X_3 = X_1X_2X_3 - X_2X_3$$

$$U_2: \; -X_1(1 - X_2)X_3 = X_1X_2X_3 - X_1X_3$$

$$U_3: \; -X_1X_2(1 - X_3) = X_1X_2X_3 - X_1X_2$$

$$C_{12}: \; -(1 - X_1)(1 - X_2)X_3 = X_1X_3 + X_2X_3 - X_3 - X_1X_2X_3$$

$$C_{13}: \; -(1 - X_1)X_2(1 - X_3) = X_1X_2 + X_2X_3 - X_2 - X_1X_2X_3$$

$$C_{23}: \; -X_1(1 - X_2)(1 - X_3) = X_1X_2 + X_1X_3 - X_1 - X_1X_2X_3$$

$$C_{123}: \; -(1 - X_1)(1 - X_2)(1 - X_3) = -1 + X_1 + X_2 + X_3 - X_1X_2 - X_1X_3 - X_2X_3 + X_1X_2X_3$$
The formulae to compute the components are as follows:

$$U_1 = R^2 - R^2_{y.23}$$

$$U_2 = R^2 - R^2_{y.13}$$

$$U_3 = R^2 - R^2_{y.12}$$

$$C_{12} = R^2_{y.13} + R^2_{y.23} - R^2_{y.3} - R^2$$

$$C_{13} = R^2_{y.12} + R^2_{y.23} - R^2_{y.2} - R^2$$

$$C_{23} = R^2_{y.12} + R^2_{y.13} - R^2_{y.1} - R^2$$

$$C_{123} = R^2_{y.1} + R^2_{y.2} + R^2_{y.3} - R^2_{y.12} - R^2_{y.13} - R^2_{y.23} + R^2$$

Models with more predictors follow the same rule. Seibold and McPhee (1979) provide the
formulae for models with up to five predictors.
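
The polynomial rule can be sketched in code for any number of predictors. The following is a minimal illustration, not a published implementation: the function names `expand_component` and `commonality_components` and the two-predictor $R^2$ values are assumptions made up for the example. Each product is expanded into signed predictor subsets, and each subset is then replaced by the $R^2$ of the regression on exactly those predictors.

```python
from itertools import combinations


def expand_component(component, k):
    """Expand -prod(1 - X_i, i in component) * prod(X_j, j not in component).

    Returns {subset: sign}, where each subset (a frozenset of predictor
    numbers) stands for the R^2 of the regression on exactly those predictors.
    """
    component = frozenset(component)
    others = frozenset(range(1, k + 1)) - component
    terms = {}
    for m in range(len(component) + 1):
        # Taking -X_i from m of the (1 - X_i) factors gives (-1)^m; the
        # leading minus sign turns that into (-1)^(m + 1).
        for chosen in combinations(sorted(component), m):
            terms[others | frozenset(chosen)] = (-1) ** (m + 1)
    return terms


def commonality_components(r2, k):
    """Compute all 2^k - 1 components from a table of all-subsets R^2 values."""
    result = {}
    for size in range(1, k + 1):
        for component in combinations(range(1, k + 1), size):
            total = 0.0
            for subset, sign in expand_component(component, k).items():
                # The empty subset is the constant term: no predictors, R^2 = 0.
                total += sign * (r2[subset] if subset else 0.0)
            result[component] = total
    return result


# Hypothetical R^2 values for a two-predictor model, for illustration only.
r2 = {frozenset({1}): 0.30, frozenset({2}): 0.20, frozenset({1, 2}): 0.40}
parts = commonality_components(r2, 2)
for component, value in parts.items():
    print(component, round(value, 3))
```

Because the components partition the explained variance, their sum should equal the full-model $R^2$, which makes a convenient check on any hand calculation.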

Issues of Commonality Analysis 

Several issues in commonality analysis need to be addressed. First, the
number of components increases exponentially as the number of predictors in the model
increases. It is practical to keep the number of predictors to four or fewer. This may be
accomplished by grouping variables or by selecting the best predictors through a series of
preliminary analyses, e.g., factor analysis. In grouping variables, theoretical
support and the natural relationships among the variables should be considered. Second, the
distinction between common effects in the partition of variance and interaction effects in the
regression model should be maintained. Interaction is the unique effect of two or more independent
variables that in combination affect the dependent variable. Commonality indicates the
proportion of the predictive ability of a single variable that also happens to reside in another
single predictor variable; no unique effect of the predictors acting in combination is involved
(Thompson, 1985). The difference between the interaction effect and the common effect is
demonstrated in Figures 2 and 3 below. In Figure 2, the interaction effect of predictors 1 and 2 is
represented in the model by a product variable 1*2, which is treated as a third predictor with
unique effect $U_{1*2}$. In the explained variance there are components common to the main effect
predictors and the interaction effect predictor. In Figure 3, on the other hand, there are only
common effects. The variance associated with the interaction effect is now part of the unexplained
or error variance.





Figure 2. Regression model with interaction effect present.

Figure 3. Regression model with main effects only.



Finally, it should be noted that negative commonalities may occur. A negative
commonality frequently indicates the presence of a suppressor effect, which arises when highly
correlated variables affect each other in opposite directions. This explains why the
unique contribution of a predictor to the explained variance is sometimes larger than the overall
contribution of that predictor when another predictor is present in the model.
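
A small numerical sketch may make the suppressor case concrete. The correlations below are hypothetical, chosen so that predictor b is unrelated to y but correlated with predictor a; the helper name `r2_two_predictors` is likewise an assumption. It uses the standard expression for the squared multiple correlation of a two-predictor model in terms of the three pairwise correlations.

```python
def r2_two_predictors(r_ya, r_yb, r_ab):
    """Squared multiple correlation of y on a and b from pairwise correlations."""
    return (r_ya**2 + r_yb**2 - 2 * r_ya * r_yb * r_ab) / (1 - r_ab**2)


# Hypothetical correlations: b does not predict y but overlaps with a.
r_ya, r_yb, r_ab = 0.5, 0.0, 0.5

r2_full = r2_two_predictors(r_ya, r_yb, r_ab)
u_a = r2_full - r_yb**2               # unique to a
u_b = r2_full - r_ya**2               # unique to b
c_ab = r_ya**2 + r_yb**2 - r2_full    # common to a and b

print(f"R2 = {r2_full:.4f}, Ua = {u_a:.4f}, Ub = {u_b:.4f}, Cab = {c_ab:.4f}")
```

Under these assumed values the unique effect of a is larger than its squared correlation with y on its own, and the common component is negative, which is the pattern described above.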

Canonical Commonality Analysis 

Tabachnick and Fidell (1996, p. 195) maintain that “the easiest way to understand canonical
correlation is to think of multiple regression.” In multiple regression, several variables on one
side are combined linearly to predict a single variable on the other side. The same linear
composition of variables applies to canonical correlation, except that there are several variables
on both sides. In the simplest terms, canonical correlation analysis is a way to assess how two
sets of variables relate to each other. The canonical correlation, the model effect, is the
correlation between the pair of variable composites that are most highly correlated. There can be
as many pairs of variable composites, which are called canonical functions or variates, as there
are variables in the smaller of the two sets. Canonical commonality analysis applies
commonality analysis to partition the variance of a variate of one set of variables (conveniently
called the criterion variate) that is associated with the variables in the other set (conveniently
called the predictor variables). It is an extension of commonality analysis in multiple regression.
Since canonical correlation is a symmetrical measure, the relation can be stated in both
directions. The difference is that instead of being applied to the variance of one dependent
variable, the analysis is applied to a variate, a linear composite of a set of variables. A subset
of the variables from Holzinger’s data is used here to demonstrate the steps of the analysis and
the interpretation of the results.

An Example
The data used here are part of Holzinger’s study of ability batteries administered to 301 students.
Two sets of variables are chosen: t5, t6, and t9 on one side and t20, t22, and t24 on the other.
The descriptions of the variables are as follows:

T5 GENERAL INFORMATION VERBAL TEST
T6 PARAGRAPH COMPREHENSION TEST
T9 WORD MEANING TEST
T20 DEDUCTIVE MATH ABILITY
T22 MATH WORD PROBLEM REASONING
T24 WOODY-MCCALL MIXED MATH FUNDAMENTALS TEST
The first group of variables measures verbal ability, and the second group measures math
problem solving ability in general. Canonical correlation analysis is run to determine the
relationship between math problem solving ability and verbal ability, and commonality
analysis is then performed to shed light on the interpretation.

Steps of Canonical Commonality Analysis 
The first step in canonical commonality analysis is to run the canonical correlation
analysis. The SPSS syntax for the canonical correlation analysis is presented in Appendix
A. Three pairs of variates are created. The dimension reduction analysis shows that only the
first variate on the criterion side has a significant effect; because the effect size decreases
from variate 1 to variate 3, the rest of the analysis is performed on variate 1 only. Table 1
summarizes the analysis for variate 1.

INSERT TABLE 1 ABOUT HERE

The squared canonical correlation for the first pair of variates is 0.37883, which means that about
38% of the variance in criterion variate 1 is explained by the predictor variables t5, t6, and t9
as a group. From the adequacy coefficients and the squared structure coefficients, one can see
that both groups of variables contribute highly to the variance in their respective composite
variables, i.e., the criterion variate and the predictor variate. The unique or common contribution
of individual predictors to the criterion’s variance, however, cannot be determined. All one can
say with the information acquired thus far is that the verbal ability variables together predict
about 38% of math problem solving ability. Commonality analysis is needed here to
identify the contributions of individual predictors.

The aim of canonical commonality analysis is to understand the partition of the variance of
the criterion variate explained by the predictors. The second step of the analysis is, therefore, to
compute the criterion variate scores. The standardized canonical coefficients (labeled Func in
Table 1) and the standard scores of the dependent variables are used to compute the criterion
variate scores using the formula CRIT1 = -.264 * zt20 + -.561 * zt22 + -.461 * zt24 (see
Appendix A for the SPSS syntax). The criterion variate score for each of the 301 students
is computed and saved under the variable name CRIT1.
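
The second step can be sketched outside SPSS as well. The example below is an illustration only: the five rows of raw scores are hypothetical, the function names are invented for the sketch, and the z-scores are assumed to use the sample standard deviation (n - 1 denominator). Only the three weights come from Table 1.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize with the sample standard deviation (n - 1 denominator)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def variate_scores(columns, weights):
    """Weighted sum of z-scores, giving one variate score per case."""
    z_cols = {name: z_scores(columns[name]) for name in weights}
    n_cases = len(next(iter(z_cols.values())))
    return [sum(weights[name] * z_cols[name][i] for name in weights)
            for i in range(n_cases)]


# Standardized canonical function coefficients reported in Table 1.
weights = {"t20": -0.264, "t22": -0.561, "t24": -0.461}

# Hypothetical raw scores for five students, for illustration only.
scores = {
    "t20": [10, 12, 9, 15, 11],
    "t22": [20, 25, 18, 30, 22],
    "t24": [14, 17, 13, 21, 16],
}

crit1 = variate_scores(scores, weights)
print([round(score, 3) for score in crit1])
```

Because every weight is negative, a student who scores high on all three tests receives a low CRIT1 score; the sign of a canonical variate is arbitrary and does not affect the $R^2$ values used later.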

The third step of canonical commonality analysis is to conduct multiple regressions
of the criterion variate scores on all possible sets of predictors. There are seven
($2^3 - 1 = 7$) possible combinations of at least one of the three predictors. In SPSS,
the regressions have to be run one by one; SAS offers a useful procedure, PROC
RSQUARE, that allows the user to compute all of those regressions at once. The SPSS
syntax for the regression runs in this example is also given in Appendix A. The $R^2$ values are
presented in Table 2.
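
For readers without access to either package, the all-subsets step can be sketched from a correlation matrix alone. This is a minimal illustration under stated assumptions: the correlations are hypothetical (not the Holzinger values), the function names are invented, and the $R^2$ of each subset is obtained from the standardized weights as $R^2 = \sum_j \beta_j r_{yj}$.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def all_subsets_r2(r_xx, r_xy, names):
    """R^2 for every non-empty predictor subset, from correlations alone."""
    table = {}
    for size in range(1, len(names) + 1):
        for idx in combinations(range(len(names)), size):
            sub = [[r_xx[i][j] for j in idx] for i in idx]
            rhs = [r_xy[i] for i in idx]
            beta = solve(sub, rhs)  # standardized regression weights
            table[tuple(names[i] for i in idx)] = sum(
                b * r for b, r in zip(beta, rhs))
    return table


# Hypothetical correlations among three predictors and with the criterion.
r_xx = [[1.0, 0.5, 0.4],
        [0.5, 1.0, 0.6],
        [0.4, 0.6, 1.0]]
r_xy = [0.3, 0.4, 0.5]

r2_table = all_subsets_r2(r_xx, r_xy, ["x1", "x2", "x3"])
for subset, value in r2_table.items():
    print(", ".join(subset), round(value, 3))
```

The resulting dictionary plays the same role as Table 2: one $R^2$ value for each of the seven predictor combinations.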
INSERT TABLE 2 ABOUT HERE

Step four is to compute the commonalities by substituting the $R^2$ values into the
commonality formulae given earlier for the three-predictor model. The results are tabulated in
Table 3. As Table 3 shows, all of the predictors have very small unique effects; e.g., the unique
effect of predictor t5 is only 0.002, which is only 0.2% of
the total variance in the variate CRIT1 and about 0.5% of the explained variance. The effects
common to any two of the three predictors are not large either; e.g., the commonality associated
with t5 and t6 is only 0.7% of the total variance in CRIT1 and about 1.85% of the explained
variance. The major contribution to CRIT1 comes from the three-way commonality, which is 20.6%
of the total variance and over 50% of the explained variance.
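
The arithmetic of step four can be checked with a few lines of code. The sketch below simply substitutes the seven $R^2$ values from Table 2 into the three-predictor formulae; the dictionary keys (1 = t5, 2 = t6, 3 = t9) are a labeling convention chosen for this example.

```python
# R^2 values from Table 2, keyed by predictor numbers (1 = t5, 2 = t6, 3 = t9).
r2 = {"1": 0.248, "2": 0.294, "3": 0.344,
      "12": 0.329, "13": 0.353, "23": 0.377, "123": 0.379}
full = r2["123"]

components = {
    "U1": full - r2["23"],
    "U2": full - r2["13"],
    "U3": full - r2["12"],
    "C12": r2["13"] + r2["23"] - r2["3"] - full,
    "C13": r2["12"] + r2["23"] - r2["2"] - full,
    "C23": r2["12"] + r2["13"] - r2["1"] - full,
    "C123": (r2["1"] + r2["2"] + r2["3"]
             - r2["12"] - r2["13"] - r2["23"] + full),
}

for name, value in components.items():
    share = 100 * value / full
    print(f"{name:<5}{value:7.3f}{share:8.2f}% of explained variance")
```

The seven components add back up to the full-model $R^2$ of .379, which confirms that the partition is complete.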

INSERT TABLE 3 ABOUT HERE 

It is interesting to see from the bottom of Table 3 that t5, t6, and t9 explain 24.8%,
29.4%, and 34.4% of the variance in CRIT1, respectively (these values can also be found in Table 2
as the $R^2$ values of the one-predictor regressions), and then to look up Table 3 and see that the
unique explanatory ability of the three predictors is actually only 0.2%, 2.6%, and 5%.
Commonality analysis thus reveals that the three predictors share most of their predictive power.
Among the three predictors, t9 contributes most to the variance in CRIT1. Since the three
predictors share most of their predictive power, t9 by itself would do a good enough job as the
predictor of CRIT1.

Discussion 

From the above example, one can see the value of commonality analysis: it shows
exactly where the effects fall. Thompson (1985) and Daniel (1989) also point out that
canonical commonality analysis honors the multivariate nature of the dependent variables and
examines their effects without taking them out of the multivariate context. Attempts to
interpret other statistics from canonical correlation analysis have been made in the past,
but the canonical correlation is the only true multivariate effect. The interpretation of the pairs
of variates involves several aspects. First, $R_c$, the canonical correlation, gives the strength of
the relationship between the variates in each pair. Second, the adequacy coefficient of a variate,
which is the average of the squared structure coefficients of the variables that generate the
variate, shows how strongly those variables relate to the variate of their own set. Third, the
redundancy index of a variate, which is the product of the squared canonical correlation and the
adequacy coefficient, gives the percentage of variance explained by the variables in the other set
together. Together these pieces of information portray how the two sets of variables relate within
and across sets. Adequacy coefficients are obviously univariate statistics. Roberts (1999) has
cited proof that redundancy indexes are also univariate, so they should not be used to interpret
the multivariate result of canonical correlation analysis. Roberts also points out that the
canonical correlation, rather than the redundancy index, is what the analysis maximizes, and it
should therefore be the statistic used for interpretation.
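
The definitions of the adequacy coefficient and the redundancy index can be illustrated with the structure coefficients reported in Table 1. This is a sketch that assumes 0.37883 is the squared canonical correlation, a reading that is consistent with the arithmetic of Table 1; the function names are invented for the example, and small differences from the tabled percentages reflect rounding of the published coefficients.

```python
# Structure coefficients for variate 1, as reported in Table 1.
struct_criterion = {"t20": -0.637, "t22": -0.849, "t24": -0.772}
struct_predictor = {"t5": -0.809, "t6": -0.881, "t9": -0.952}
rc_squared = 0.37883  # assumed to be the squared canonical correlation


def adequacy(structure):
    """Average of the squared structure coefficients of one variable set."""
    return sum(s ** 2 for s in structure.values()) / len(structure)


def redundancy(structure, rc2):
    """Adequacy coefficient multiplied by the squared canonical correlation."""
    return adequacy(structure) * rc2


for label, structure in (("criterion", struct_criterion),
                         ("predictor", struct_predictor)):
    print(f"{label}: adequacy = {adequacy(structure):.2%}, "
          f"redundancy = {redundancy(structure, rc_squared):.2%}")
```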

To simplify the analysis, researchers want to use fewer variables that yield a model effect
of similar magnitude. According to the law of parsimony, the simpler the explanation,
the higher the probability of replicating the result and the more likely the explanation is to be
true. Canonical commonality analysis can also provide guidelines for deleting predictors in
canonical correlation analysis. In the above example, if t5 is deleted, the predictive power of
$U_1$ is lost, but the effect size drops by only .002. If both t5 and t6 are deleted, the predictive
power of $U_2$ and $C_{12}$ is also lost, but the effect size drops only to .344. In this case t9
predicts almost as well as all three predictors together.
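
The deletion reasoning can be laid out as a small calculation. The sketch below takes the $R^2$ values from Table 2 and reports, for each retained subset of predictors, how much explained variance is given up relative to the full model; the variable names are chosen for this example only.

```python
# R^2 values from Table 2, keyed by the predictors retained in the model.
r2 = {
    ("t5",): 0.248, ("t6",): 0.294, ("t9",): 0.344,
    ("t5", "t6"): 0.329, ("t5", "t9"): 0.353, ("t6", "t9"): 0.377,
    ("t5", "t6", "t9"): 0.379,
}
full = r2[("t5", "t6", "t9")]

# Explained variance lost when only the listed predictors are retained.
loss = {kept: full - value for kept, value in r2.items()}

# The single predictor that preserves the most explained variance.
best_single = max((kept for kept in r2 if len(kept) == 1), key=r2.get)

for kept in sorted(loss, key=loss.get):
    print(f"keep {', '.join(kept):<12} R2 = {r2[kept]:.3f}  loss = {loss[kept]:.3f}")
print("best single predictor:", best_single[0])
```

Retaining t6 and t9 costs .002, which is $U_1$, and retaining t9 alone costs .035, which is $U_1 + U_2 + C_{12}$, in agreement with the component values in Table 3.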
Overall, with its ability to decompose explained variance into parts associated uniquely
and commonly with predictors, and with its straightforward calculation, commonality analysis can be
very helpful to researchers seeking to know more about the effects in their regression analyses.
Canonical commonality analysis helps to interpret the canonical correlation analysis effect and
sheds light on the contributions of individual predictors. It honors the multivariate context and
preserves the level of scale of the variables, and it is useful when the number of predictors is
less than five.
Table 1

Canonical Correlation Analysis Summary on Variate 1

Variable/                 Variate 1
Statistics      Func      Struc     Struc^2
T20            -.264     -.637     40.58%
T22            -.561     -.849     72.08%
T24            -.461     -.772     59.60%
Adequacy                           57.43%
Rd                                 21.76%
Rc^2                               37.88%
Rd                                 29.51%
Adequacy                           77.90%
T5             -.115     -.809     65.45%
T6             -.386     -.881     77.62%
T9             -.595     -.952     90.63%
Table 2

R^2 of the multiple regressions of CRIT1 with at least one of the three predictors

Symbol         Predictors      R^2 value
R^2_{y.1}      t5              .248
R^2_{y.2}      t6              .294
R^2_{y.3}      t9              .344
R^2_{y.12}     t5, t6          .329
R^2_{y.13}     t5, t9          .353
R^2_{y.23}     t6, t9          .377
R^2_{y.123}    t5, t6, t9      .379
Table 3

Commonality Analysis Summary Table for CRIT1

Unique/Commonality     T5        T6        T9
U1                     .002
U2                               .026
U3                                         .050
C12                    .007      .007
C13                    .033                .033
C23                              .055      .055
C123                   .206      .206      .206
Total                  .248      .294      .344
Commonality            .246      .268      .294
R^2 explained by       24.8%     29.4%     34.4%
APPENDIX A

SPSS Syntax Used to Do the Canonical Commonality Analysis

TITLE 'Canonical Commonality Analysis'.

COMMENT data used is taken from Holzinger & Swineford (1937).

COMMENT Run Canonical Analysis Using SPSS MANOVA command.
MANOVA
  t20 t22 t24 WITH t5 t6 t9
  /PRINT=SIGNIF (MULTIV EIGEN DIMENR)
  /DISCRIM= (STAN CORR ALPHA (.999)).

COMMENT create the z-scores for the six variables.
DESCRIPTIVES
  VARIABLES=t20 t22 t24 t5 t6 t9 /SAVE
  /STATISTICS=MEAN STDDEV MIN MAX .

COMMENT compute and save the criterion variate scores for canonical function 1.
COMPUTE CRIT1 = -.264 * zt20 + -.561 * zt22 + -.461 * zt24.
EXECUTE .

COMMENT Run regressions on all possible combinations of the three predictors to obtain the R-squared values for commonality analysis.
REGRESSION
  /MISSING LISTWISE
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT crit1
  /METHOD=ENTER t5 .

REGRESSION
  /MISSING LISTWISE
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT crit1
  /METHOD=ENTER t6 .

REGRESSION
  /MISSING LISTWISE
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT crit1
  /METHOD=ENTER t9 .

REGRESSION
  /MISSING LISTWISE
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT crit1
  /METHOD=ENTER t5 t6 .

REGRESSION
  /MISSING LISTWISE
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT crit1
  /METHOD=ENTER t5 t9 .

REGRESSION
  /MISSING LISTWISE
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN